GENERALIZED NEWTON'S METHOD 
BASED ON GRAPHICAL DERIVATIVES 






T. HOHEISEL 1 , C. KANZOW 1 , B. S. MORDUKHOVICH 2 and H. PHAN 2

Abstract. This paper concerns developing a numerical method of the Newton type to solve
systems of nonlinear equations described by nonsmooth continuous functions. We propose and
justify a new generalized Newton algorithm based on graphical derivatives, which have never been used
to derive a Newton-type method for solving nonsmooth equations. Based on advanced techniques
of variational analysis and generalized differentiation, we establish the well-posedness of the
algorithm, its local superlinear convergence, and its global convergence of the Kantorovich type. Our
convergence results hold with no semismoothness assumption, which is illustrated by examples. The
algorithm and main results obtained in the paper are compared with well-recognized semismooth
and B-differentiable versions of Newton's method for nonsmooth Lipschitzian equations.
1 Introduction

Newton's method is one of the most powerful and useful methods in optimization and in
the related area of solving systems of nonlinear equations

(1.1) $H(x) = 0$

defined by continuous vector-valued mappings $H\colon \mathbb{R}^n \to \mathbb{R}^n$. In the classical setting when
$H$ is a continuously differentiable (smooth, $C^1$) mapping, Newton's method builds the
following iteration procedure

(1.2) $x^{k+1} := x^k + d^k$ for all $k = 0, 1, 2, \ldots,$

where $x^0 \in \mathbb{R}^n$ is a given starting point, and where $d^k \in \mathbb{R}^n$ is a solution to the linear
system of equations (often called "Newton equation")

(1.3) $H'(x^k)d = -H(x^k).$

A detailed analysis and numerous applications of the classical Newton's method (1.2), (1.3)
and its modifications can be found, e.g., in the books [7, 14, 26] and the references therein.
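
For orientation, the classical scheme (1.2), (1.3) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the two-dimensional test system `H`, its Jacobian `dH`, the helper `solve2`, the tolerance, and the starting point are all hypothetical choices made here and do not come from the paper.

```python
import math

def solve2(A, b):
    """Solve the 2x2 linear system A d = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0.0:
        raise ZeroDivisionError("singular Jacobian in the Newton equation")
    d0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    d1 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return [d0, d1]

def newton(H, dH, x0, tol=1e-12, max_iter=50):
    """Iteration (1.2): x^{k+1} = x^k + d^k with H'(x^k) d^k = -H(x^k), cf. (1.3)."""
    x = list(x0)
    for _ in range(max_iter):
        Hx = H(x)
        if math.hypot(Hx[0], Hx[1]) <= tol:   # simple residual-based stopping test
            break
        d = solve2(dH(x), [-Hx[0], -Hx[1]])   # Newton equation (1.3)
        x = [x[0] + d[0], x[1] + d[1]]        # update (1.2)
    return x

def H(x):
    # Hypothetical smooth system: a circle of radius 2 intersected with the diagonal.
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]

def dH(x):
    # Jacobian of the hypothetical system above.
    return [[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]]

root = newton(H, dH, [1.0, 2.0])
```

From the chosen starting point the iterates approach the solution with both components equal to $\sqrt{2}$; a different starting point may lead to the other solution or to a singular Jacobian.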



1 Institute of Applied Mathematics and Statistics, University of Würzburg, Am Hubland, 97074 Würzburg,
Germany (hoheisel@mathematik.uni-wuerzburg.de, kanzow@mathematik.uni-wuerzburg.de).

2 Department of Mathematics, Wayne State University, Detroit, MI 48202, USA (boris@math.wayne.edu,
pmhung@wayne.edu). The research of these authors was partially supported by the US National Science
Foundation under grants DMS-0603846 and DMS-1007132 and by the Australian Research Council under
grant DP-12092508.






However, in the vast majority of applications (including those to optimization, variational
inequalities, complementarity and equilibrium problems, etc.) the underlying mapping
$H$ in (1.1) is nonsmooth. Indeed, the aforementioned optimization-related models and
their extensions can be written via Robinson's formalism of "generalized equations," which
in turn can be reduced to standard equations of the form above (using, e.g., the projection
operator) while with intrinsically nonsmooth mappings $H$; see [8, 19, 33, 29] for more
details, discussions, and references.

Robinson originally proposed (see [32] and also [34] based on his earlier preprint) a
point-based approximation approach to solve nonsmooth equations, which then was
developed by his student Josephy [11] to extend Newton's method for solving variational
inequalities and complementarity problems. Other approaches replace the classical derivative
$H'(x^k)$ in the Newton equation (1.3) by some generalized derivatives. In particular,
the B-differentiable Newton method developed by Pang [27, 28] uses the iteration scheme
(1.2) with $d^k$ being a solution to the subproblem

(1.4) $H'(x^k; d) = -H(x^k),$

where $H'(x^k; d)$ denotes the classical directional derivative of $H$ at $x^k$ in the direction $d$.
Besides the existence of the classical directional derivative in (1.4), a number of strong
assumptions are imposed in [27, 28] to establish appropriate convergence results; see Section 5
below for more discussions and comparisons.

In another approach developed by Kummer [16] and Qi and Sun [31], the direction $d^k$
in (1.2) is taken as a solution to the linear system of equations

(1.5) $A_k d = -H(x^k),$

where $A_k$ is an element of Clarke's generalized Jacobian $\partial_C H(x^k)$ of a Lipschitz continuous
mapping $H$. In [30], Qi suggested to replace $A_k \in \partial_C H(x^k)$ in (1.5) by the choice of $A_k$ from
the so-called B-subdifferential $\partial_B H(x^k)$ of $H$ at $x^k$, which is a proper subset of $\partial_C H(x^k)$;
see Section 4 for more details. We also refer the reader to [U H5j [M] and bibliographies
therein for wide overviews, historical remarks, and other developments on Newton's method
for nonsmooth Lipschitz equations as in (1.1) and to [13] for some recent applications.

It is proved in [31] and [30] that the Newton-type method based on implementing the
generalized Jacobian and B-subdifferential in (1.5), respectively, superlinearly converges to
a solution of (1.1) for a class of semismooth mappings $H$; see Section 4 for the definition and
discussions. This subclass of Lipschitz continuous and directionally differentiable mappings
is rather broad and useful in applications to optimization-related problems. However, not
every mapping arising in applications (from both theoretical and practical viewpoints) is
either directionally differentiable or Lipschitz continuous. The reader can find valuable
classes of functions and mappings of this type in [24, 35] and overwhelmingly in spectral
function analysis, eigenvalue optimization, studying of roots of polynomials, stability of
control systems, etc.; see, e.g., [3] and the references therein.

The main goal and achievements of this paper are as follows. We propose a new Newton-type
algorithm to solve nonsmooth equations (1.1) described by general continuous mappings
$H$ that is based on graphical derivatives. It reduces to the classical Newton method
(1.3) when $H$ is smooth, being different from previously known versions of Newton's method
in the case of Lipschitz continuous mappings $H$. Based on advanced tools of variational
analysis involving metric regularity and coderivatives, we justify well-posedness of the new
algorithm and its superlinear local and global (of the Kantorovich type) convergence under
verifiable assumptions that hold for semismooth mappings but are not restricted to them.
Detailed comparisons of our algorithm and results with the semismooth and B-differentiable
Newton methods are given and certain improvements of these methods are justified.

Note that metric regularity and related concepts of variational analysis have been employed
in the analysis and justification of numerical algorithms starting with Robinson's seminal
contribution; see, e.g., [1, 18, 25] and their references for the recent account. However, we
are not familiar with any usage of graphical derivatives and coderivatives for these purposes.

The rest of the paper is organized as follows. In Section 2 we present basic definitions 
and preliminaries from variational analysis and generalized differentiation widely used for 
formulations and proofs of the main results. 

Section 3 is devoted to the description of the new generalized Newton algorithm,
justifying its well-posedness/solvability and establishing its superlinear local and global
convergence under appropriate assumptions on the underlying mapping $H$.

In Section 4 we compare our algorithm with the scheme of (1.5). We also discuss in detail
the major assumptions made in Section 3, deriving sufficient conditions for their fulfillment
and comparing them with those in the semismooth Newton methods.

Section 5 contains applications of our algorithm to the B-differentiable Newton method
(1.4) with largely relaxed assumptions in comparison with known ones. In Section 6 we
give some concluding remarks and discussions on further research.

Our notation is basically standard in variational analysis and numerical optimization;
cf. [8j [Ml EH]. Recall that, given a set-valued mapping $F\colon \mathbb{R}^n \rightrightarrows \mathbb{R}^m$, the expression

(1.6) $\mathop{\mathrm{Lim\,sup}}_{x \to \bar x} F(x) := \big\{ y \in \mathbb{R}^m \;\big|\; \exists\, x_k \to \bar x \text{ and } y_k \to y \text{ as } k \to \infty \text{ with } y_k \in F(x_k) \text{ for all } k \in \mathbb{N} \big\}$

defines the Painlevé-Kuratowski upper/outer limit of $F$ as $x \to \bar x$. Let us also mention that
the symbols $\mathrm{cone}\,\Omega$ and $\mathrm{co}\,\Omega$ stand, respectively, for the conic hull and convex hull of the
set in question, that $\mathrm{dist}(x; \Omega)$ denotes the Euclidean distance between a point $x \in \mathbb{R}^n$ and
a set $\Omega$, and that the notation $A^T$ signifies the matrix transposition. As usual, $B_\varepsilon(x)$ stands
for the closed ball centered at $x$ with radius $\varepsilon > 0$.

2 Tools of Variational Analysis

In this section we briefly review some constructions and results from variational analysis
and generalized differentiation widely used in what follows. The reader may consult the
texts [H Ell (Ml [36] for more details and additional material.

Given a nonempty set $\Omega \subset \mathbb{R}^n$ and a point $\bar x \in \Omega$, the (Bouligand-Severi)
tangent/contingent cone to $\Omega$ at $\bar x$ is defined by

(2.1) $T(\bar x; \Omega) := \mathop{\mathrm{Lim\,sup}}_{t \downarrow 0} \dfrac{\Omega - \bar x}{t}$

via the outer limit (1.6). This cone is often nonconvex while its polar/dual cone

(2.2) $\widehat N(\bar x; \Omega) := \{ p \in \mathbb{R}^n \mid \langle p, u \rangle \le 0 \text{ for all } u \in T(\bar x; \Omega) \}$

is always convex and can be intrinsically described by

$\widehat N(\bar x; \Omega) = \Big\{ p \in \mathbb{R}^n \;\Big|\; \limsup_{x \xrightarrow{\Omega} \bar x} \dfrac{\langle p, x - \bar x \rangle}{\|x - \bar x\|} \le 0 \Big\}, \quad \bar x \in \Omega,$

where the symbol $x \xrightarrow{\Omega} \bar x$ signifies that $x \to \bar x$ with $x \in \Omega$. The construction (2.2) is known
as the prenormal cone or the Fréchet/regular normal cone to $\Omega$ at $\bar x \in \Omega$. For convenience
we put $\widehat N(\bar x; \Omega) = \emptyset$ if $\bar x \notin \Omega$. Observe that the prenormal cone (2.2) may not have natural
properties of generalized normals in the case of nonconvex sets $\Omega$; e.g., it often happens
that $\widehat N(\bar x; \Omega) = \{0\}$ when $\bar x$ is a boundary point of $\Omega$, and the cone (2.2) does not possess
required calculus rules. The situation is dramatically improved when we consider a robust
regularization of (2.2) via the outer limit (1.6) and arrive at the construction

(2.3) $N(\bar x; \Omega) := \mathop{\mathrm{Lim\,sup}}_{x \to \bar x} \widehat N(x; \Omega)$

known as the (limiting, basic, Mordukhovich) normal cone to $\Omega$ at $\bar x \in \Omega$. If $\Omega$ is locally
closed around $\bar x$, the basic normal cone (2.3) can be equivalently described as

$N(\bar x; \Omega) = \mathop{\mathrm{Lim\,sup}}_{x \to \bar x} \big[ \mathrm{cone}\big(x - \Pi(x; \Omega)\big) \big], \quad \bar x \in \Omega,$

via the Euclidean projector $\Pi(\cdot\,; \Omega)$ on $\Omega$; this was in fact the original definition of the
normal cone in [21]. Despite its nonconvexity, the normal cone (2.3) and the corresponding
subdifferential and coderivative constructions for extended-real-valued functions and
set-valued mappings enjoy comprehensive calculus rules, which are particularly based on
variational/extremal principles of variational analysis.

Consider next a set-valued mapping $F\colon \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ with the graph

$\mathrm{gph}\,F := \{ (x, y) \in \mathbb{R}^n \times \mathbb{R}^m \mid y \in F(x) \}$

and define the graphical derivative and coderivative constructions generated by the tangent
and normal cones, respectively. Given $(\bar x, \bar y) \in \mathrm{gph}\,F$, the graphical/contingent derivative
of $F$ at $(\bar x, \bar y)$ is introduced in [2] as a mapping $DF(\bar x, \bar y)\colon \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ with the values

(2.4) $DF(\bar x, \bar y)(z) := \{ w \in \mathbb{R}^m \mid (z, w) \in T((\bar x, \bar y); \mathrm{gph}\,F) \}, \quad z \in \mathbb{R}^n,$

defined via the contingent cone (2.1) to the graph of $F$ at the point $(\bar x, \bar y)$; see [3, 35] for
various properties, equivalent representations, and applications. The coderivative of $F$ at
$(\bar x, \bar y) \in \mathrm{gph}\,F$ is introduced in [22] as a mapping $D^*F(\bar x, \bar y)\colon \mathbb{R}^m \rightrightarrows \mathbb{R}^n$ with the values

(2.5) $D^*F(\bar x, \bar y)(v) := \{ u \in \mathbb{R}^n \mid (u, -v) \in N((\bar x, \bar y); \mathrm{gph}\,F) \}, \quad v \in \mathbb{R}^m,$

defined via the normal cone (2.3) to the graph of $F$ at $(\bar x, \bar y)$; see [MIES] for extended calculus
and a variety of applications. We drop $\bar y$ in the graphical derivative and coderivative notation
when the mapping in question is single-valued at $\bar x$. Note that the graphical derivative and
coderivative constructions in (2.4) and (2.5) are not dual to each other, since the basic
normal cone (2.3) is nonconvex and hence cannot be tangentially generated.

In this paper we employ, together with (2.4) and (2.5), the following modified derivative
construction for mappings, which seems to be new in generality although constructions of
this (radial, Dini-like) type have been widely used for extended-real-valued functions.

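Definition 2.1 (restrictive graphical derivative of mappings). Let $F\colon \mathbb{R}^n \rightrightarrows \mathbb{R}^m$,
and let $(\bar x, \bar y) \in \mathrm{gph}\,F$. Then a set-valued mapping $\widetilde DF(\bar x, \bar y)\colon \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ given by

(2.6) $\widetilde DF(\bar x, \bar y)(z) := \mathop{\mathrm{Lim\,sup}}_{t \downarrow 0} \dfrac{F(\bar x + tz) - \bar y}{t}, \quad z \in \mathbb{R}^n,$

is called the restrictive graphical derivative of $F$ at $(\bar x, \bar y)$.

The next proposition collects some properties of the graphical derivative (2.4) and its
restrictive counterpart (2.6) needed in what follows.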

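Proposition 2.2 (properties of graphical derivatives). Let $F\colon \mathbb{R}^n \rightrightarrows \mathbb{R}^m$, and let
$(\bar x, \bar y) \in \mathrm{gph}\,F$. Then the following assertions hold:

(a) We have $\widetilde DF(\bar x, \bar y)(z) \subset DF(\bar x, \bar y)(z)$ for all $z \in \mathbb{R}^n$.

(b) There are inverse derivative relationships

$DF(\bar x, \bar y)^{-1} = DF^{-1}(\bar y, \bar x)$ and $\widetilde DF(\bar x, \bar y)^{-1} = \widetilde DF^{-1}(\bar y, \bar x).$

(c) If $F$ is single-valued and locally Lipschitzian around $\bar x$, then

$\widetilde DF(\bar x)(z) = DF(\bar x)(z)$ for all $z \in \mathbb{R}^n$.

(d) If $F$ is single-valued and directionally differentiable at $\bar x$, then

$\widetilde DF(\bar x)(z) = \{ F'(\bar x; z) \}$ for all $z \in \mathbb{R}^n$.

(e) If $F$ is single-valued and Gâteaux differentiable at $\bar x$ with the Gâteaux derivative
$F'_G(\bar x)$, then we have

$\widetilde DF(\bar x)(z) = \{ F'_G(\bar x) z \}$ for all $z \in \mathbb{R}^n$.

(f) If $F$ is single-valued and (Fréchet) differentiable at $\bar x$ with the derivative $F'(\bar x)$, then

$DF(\bar x)(z) = \{ F'(\bar x) z \}$ for all $z \in \mathbb{R}^n$.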
Proof. It is shown in [35, 8(14)] that the graphical derivative (2.4) admits the representation

(2.7) $DF(\bar x, \bar y)(z) = \mathop{\mathrm{Lim\,sup}}_{t \downarrow 0,\; h \to z} \dfrac{F(\bar x + th) - \bar y}{t}, \quad z \in \mathbb{R}^n.$

The inclusion in (a) is an immediate consequence of Definition 2.1 and representation (2.7).
The first equality in (b), observed from the very beginning [2], easily follows from
definition (2.4). We can similarly check the second one in (b).

To justify the equality in (c), it remains to verify by (a) the opposite inclusion '$\supset$' when
$F$ is single-valued and locally Lipschitzian around $\bar x$. In this case fix $z \in \mathbb{R}^n$, pick any
$w \in DF(\bar x)(z)$, and find by representation (2.7) sequences $h_k \to z$ and $t_k \downarrow 0$ such that

$\dfrac{F(\bar x + t_k h_k) - F(\bar x)}{t_k} \to w$ as $k \to \infty$.

The local Lipschitz continuity of $F$ around $\bar x$ with constant $L > 0$ implies that

$\Big\| \dfrac{F(\bar x + t_k h_k) - F(\bar x)}{t_k} - \dfrac{F(\bar x + t_k z) - F(\bar x)}{t_k} \Big\| = \Big\| \dfrac{F(\bar x + t_k h_k) - F(\bar x + t_k z)}{t_k} \Big\| \le L \|h_k - z\|$

for all $k \in \mathbb{N}$ sufficiently large, and hence we have the convergence

$\dfrac{F(\bar x + t_k z) - F(\bar x)}{t_k} \to w$ as $k \to \infty$.

Thus $w \in \widetilde DF(\bar x)(z)$, which justifies (c). Assertions (d) and (e) follow directly from the
definitions. Finally, assertion (f) is implied by (e) in the local Lipschitzian case (c) while
it can be easily derived from the (Fréchet) differentiability of $F$ at $\bar x$ with no Lipschitz
assumption; see, e.g., [35, Exercise 9.25(b)]. $\triangle$

Proposition 2.2 reveals important differences between the graphical derivative (2.4)
and the coderivative (2.5). Indeed, assertions (c) and (d) of this proposition show that
the graphical derivative of locally Lipschitzian and directionally differentiable mappings
$F\colon \mathbb{R}^n \to \mathbb{R}^m$ is always single-valued. At the same time, the coderivative single-valuedness
for locally Lipschitzian mappings is equivalent to the strict/strong Fréchet differentiability
of $F$ at the point in question; see [24, Theorem 3.66]. It follows from the well-known formula

(2.8) $\mathrm{co}\, D^*F(\bar x)(z) = \{ A^T z \mid A \in \partial_C F(\bar x) \}$

that the latter strict differentiability condition characterizes also the single-valuedness of
the generalized Jacobian of $F$ at $\bar x$.

In fact, in the case of $F = (f_1, \ldots, f_m)\colon \mathbb{R}^n \to \mathbb{R}^m$ being locally Lipschitzian around $\bar x$
the coderivative (2.5) admits the subdifferential description

(2.9) $D^*F(\bar x)(z) = \partial\Big( \sum_{i=1}^m z_i f_i \Big)(\bar x)$ for any $z = (z_1, \ldots, z_m) \in \mathbb{R}^m,$

where the (basic, limiting, Mordukhovich) subdifferential $\partial f(\bar x)$ of a general scalar function
$f$ at $\bar x$ is defined geometrically by

(2.10) $\partial f(\bar x) := \{ p \in \mathbb{R}^n \mid (p, -1) \in N((\bar x, f(\bar x)); \mathrm{epi}\,f) \}$

via the normal cone (2.3) to the epigraph $\mathrm{epi}\,f := \{ (x, \mu) \in \mathbb{R}^{n+1} \mid \mu \ge f(x) \}$ and admits
analytical descriptions in terms of the outer limit (1.6) of the Fréchet/regular and proximal
subdifferentials at points nearby; see [24, 35] with the references therein. Note also that
the basic subdifferential (2.10) of a continuous function $f$ can be also described via the
coderivative of $f$ by $\partial f(\bar x) = D^*f(\bar x)(1)$; see [24, Theorem 1.80].

Finally in this section, we recall the notion of metric regularity and its coderivative
characterization that play a significant role in the paper. A mapping $F\colon \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ is
metrically regular around $(\bar x, \bar y) \in \mathrm{gph}\,F$ if there are neighborhoods $U$ of $\bar x$ and $V$ of $\bar y$ as
well as a number $\mu > 0$ such that

(2.11) $\mathrm{dist}(x; F^{-1}(y)) \le \mu\, \mathrm{dist}(y; F(x))$ for all $x \in U$ and $y \in V$.

Observe that it is sufficient to require the fulfillment of (2.11) just for those $y \in V$ satisfying
the estimate $\mathrm{dist}(y; F(x)) \le \gamma$ for some $\gamma > 0$; see [23, Proposition 1.48].

We will see below that metric regularity is crucial for justifying the well-posedness of
our generalized Newton algorithm and establishing its local and global convergence. It is
also worth mentioning that, in the opposite direction, a Newton-type method (known as the
Lyusternik-Graves iterative process) leads to verifiable conditions for metric regularity of
smooth mappings; see, e.g., the proof of [24, Theorem 1.57] and the commentaries therein.
The latter procedure is replaced by variational/extremal principles of variational analysis
in the case of nonsmooth and set-valued mappings under consideration; cf. [10, 24, 35].

In this paper we broadly use the following coderivative characterization of the metric
regularity property for an arbitrary set-valued mapping $F$ with closed graph, known also as
the Mordukhovich criterion (see [23, Theorem 3.6], [35, Theorem 9.45], and the references
therein): $F$ is metrically regular around $(\bar x, \bar y)$ if and only if the inclusion

(2.12) $0 \in D^*F(\bar x, \bar y)(z)$ implies that $z = 0$,

which amounts to the kernel condition $\ker D^*F(\bar x, \bar y) = \{0\}$.

3 The Generalized Newton Algorithm 

This section presents the main contribution of the paper: a new generalized Newton method 
for nonsmooth equations, which is based on graphical derivatives. The section consists of 
three parts. In Subsection 3.1 we precisely describe the algorithm and justify its well- 
posedness/solvability. Subsection 3.2 contains a local superlinear convergence result under 
appropriate assumptions. Finally, in Subsection 3.3 we establish a global convergence result 
of the Kantorovich type for our generalized Newton algorithm. 

3.1 Description and Justification of the Algorithm 

Keeping in mind the classical scheme of the smooth Newton method in (1.2), (1.3) and
taking into account the graphical derivative representation of Proposition 2.2(f), we propose
an extension of the Newton equation (1.3) to nonsmooth mappings given by:

(3.1) $-H(x^k) \in DH(x^k)(d^k), \quad k = 0, 1, 2, \ldots.$

This leads us to the following generalized Newton algorithm to solve (1.1):

Algorithm 3.1 (generalized Newton's method).
Step 0: Choose a starting point $x^0 \in \mathbb{R}^n$.

Step 1: Check a suitable termination criterion.

Step 2: Compute $d^k \in \mathbb{R}^n$ such that (3.1) holds.

Step 3: Set $x^{k+1} := x^k + d^k$, $k \leftarrow k + 1$, and go to Step 1.
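
The steps of Algorithm 3.1 can be sketched in Python for a one-dimensional model problem. The sketch rests on several assumptions that are not part of the algorithm itself: the test function $H(x) = 2x + |x| + x^2$ is a hypothetical example chosen here, it is locally Lipschitzian and directionally differentiable, so that by Proposition 2.2(c,d) the inclusion (3.1) becomes the equation $H'(x^k; d) = -H(x^k)$, and in one dimension this equation is solved by using the positive homogeneity of $d \mapsto H'(x; d)$. The function names, the tolerance, and the iteration limit are illustrative.

```python
def H(x):
    # Hypothetical nonsmooth test function with a kink at 0 and the root x = 0.
    return 2.0 * x + abs(x) + x * x

def dir_deriv(x, s):
    """Directional derivative H'(x; s) of the test function for s = +1 or s = -1."""
    if x > 0 or (x == 0 and s > 0):
        slope = 3.0 + 2.0 * x   # branch 3x + x^2
    else:
        slope = 1.0 + 2.0 * x   # branch x + x^2
    return slope * s

def newton_direction(x):
    """Solve H'(x; d) = -H(x) in one dimension; return None if there is no solution."""
    r = -H(x)
    if r == 0.0:
        return 0.0
    for s in (1.0, -1.0):
        g = dir_deriv(x, s)
        if g != 0.0:
            t = r / g
            if t >= 0.0:          # H'(x; t*s) = t * H'(x; s) holds only for t >= 0
                return t * s
    return None

def generalized_newton(x0, tol=1e-12, max_iter=50):
    x, iterates = x0, [x0]            # Step 0: starting point
    for _ in range(max_iter):
        if abs(H(x)) <= tol:          # Step 1: termination criterion
            break
        d = newton_direction(x)       # Step 2: solve the generalized Newton equation
        if d is None:
            raise ArithmeticError("generalized Newton equation has no solution")
        x = x + d                     # Step 3: update
        iterates.append(x)
    return x, iterates

x_star, iterates = generalized_newton(-0.2)
```

Starting from $x^0 = -0.2$ the first step crosses the kink, after which the iterates approach the root from the right. For mappings that are not directionally differentiable, Step 2 requires working with the set-valued graphical derivative directly, which this sketch does not attempt.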

The proposed Algorithm 3.1 does not require a priori any assumptions on the underlying
mapping $H\colon \mathbb{R}^n \to \mathbb{R}^n$ in (1.1) besides its continuity, which is the standing assumption in
this paper. Other assumptions are imposed below to justify the well-posedness and (local
and global) convergence of the algorithm. Observe that Proposition 2.2(c,d) ensures that
Algorithm 3.1 reduces to scheme (1.4) in the B-differentiable Newton method provided
that $H$ is directionally differentiable and locally Lipschitzian around the solution point
in question. In Section 5 we consider in detail relationships with known results for the
B-differentiable Newton method, while Section 4 compares Algorithm 3.1 and the assumptions
made with the corresponding semismooth versions in the framework of (1.5).

To proceed further, we need to make sure that the generalized Newton equation (3.1) is
solvable, which is a major part of the well-posedness of Algorithm 3.1. The next proposition
shows that an appropriate assumption to ensure the solvability of (3.1) is metric regularity.



Proposition 3.2 (solvability of the generalized Newton equation). Assume that
$H\colon \mathbb{R}^n \to \mathbb{R}^n$ is metrically regular around $\bar x$ with $\bar y = H(\bar x)$ in (2.11), i.e., we have
$\ker D^*H(\bar x) = \{0\}$. Then there is a constant $\varepsilon > 0$ such that for all $x \in B_\varepsilon(\bar x)$ the equation

(3.2) $-H(x) \in DH(x)(d)$

admits a solution $d \in \mathbb{R}^n$. Furthermore, the set $S(x)$ of solutions to (3.2) is computed by

(3.3) $S(x) = \mathop{\mathrm{Lim\,sup}}_{t \downarrow 0,\; h \to -H(x)} \dfrac{H^{-1}(H(x) + th) - x}{t} \ne \emptyset.$

Proof. By the assumed metric regularity (2.11) of $H$ we find a number $\mu > 0$ and
neighborhoods $U$ of $\bar x$ and $V$ of $H(\bar x)$ such that

$\mathrm{dist}(x; H^{-1}(y)) \le \mu\, \mathrm{dist}(y; H(x))$ for all $x \in U$ and $y \in V$.

Pick now an arbitrary vector $x \in U$ and select sequences $h_k \to -H(x)$ and $t_k \downarrow 0$ as $k \to \infty$.
Suppose with no loss of generality that $H(x) + t_k h_k \in V$ for all $k \in \mathbb{N}$. Then we have

$\mathrm{dist}(x; H^{-1}(H(x) + t_k h_k)) \le \mu t_k \|h_k\|, \quad k \in \mathbb{N},$

and hence there is a vector $u_k \in H^{-1}(H(x) + t_k h_k)$ such that $\|u_k - x\| \le \mu t_k \|h_k\|$ for all
$k \in \mathbb{N}$. This shows that the sequence $\{(u_k - x)/t_k\}$ is bounded, and thus it contains a
subsequence that converges to some element $d \in \mathbb{R}^n$. Passing to the limit as $k \to \infty$ and
recalling the definitions of the outer limit (1.6) and of the tangent cone (2.1), we arrive at

$(d, -H(x)) \in \mathop{\mathrm{Lim\,sup}}_{t \downarrow 0} \dfrac{\mathrm{gph}\,H - (x, H(x))}{t} = T((x, H(x)); \mathrm{gph}\,H),$

which justifies the desired inclusion (3.2). The solution representation (3.3) follows from
(2.7) and Proposition 2.2(b) in the case of single-valued mappings, since

$S(x) = DH(x)^{-1}(-H(x))$

due to (3.2). This completes the proof of the proposition. $\triangle$

3.2 Local Convergence 

In this subsection we first formulate major assumptions of our generalized Newton method
and then show that they ensure the superlinear local convergence of Algorithm 3.1.

(H1) There exist a constant $C > 0$, a neighborhood $U$ of $\bar x$, and a neighborhood $V$ of the
origin in $\mathbb{R}^n$ such that the following holds:

For all $x \in U$, $z \in V$, and for any $d \in \mathbb{R}^n$ with $-H(x) \in DH(x)(d)$ there is a vector
$w \in DH(x)(z)$ such that

$C \|d - z\| \le \|w + H(x)\| + o(\|x - \bar x\|).$

(H2) There exists a neighborhood $U$ of $\bar x$ such that for all $x \in U$ and all $v \in DH(x)(\bar x - x)$ we have

$\|H(x) - H(\bar x) + v\| = o(\|x - \bar x\|).$

A detailed discussion of these two assumptions and sufficient conditions for their fulfillment
are given in Section 4. Note that assumption (H2) means, in the terminology of [51 Definition
7.2.2] focused on locally Lipschitzian mappings $H$, that the family $\{DH(x)\}$ provides
a Newton approximation scheme for $H$ at $\bar x$.

Now we establish our principal local convergence result that makes use of the major
assumptions (H1) and (H2) together with metric regularity.

Theorem 3.3 (superlinear local convergence of the generalized Newton method).
Let $\bar x \in \mathbb{R}^n$ be a solution to (1.1) for which the underlying mapping $H\colon \mathbb{R}^n \to \mathbb{R}^n$ is
metrically regular around $\bar x$ and assumptions (H1) and (H2) are satisfied. Then there is a number
$\varepsilon > 0$ such that for all $x^0 \in B_\varepsilon(\bar x)$ the following assertions hold:

(i) Algorithm 3.1 is well defined and generates a sequence $\{x^k\}$ converging to $\bar x$.

(ii) The rate of convergence $x^k \to \bar x$ is at least superlinear.

Proof. To justify (i), pick $\varepsilon > 0$ such that assumptions (H1) and (H2) hold with $U := B_\varepsilon(\bar x)$
and $V := B_\varepsilon(0)$ and such that Proposition 3.2 can be applied. Then we choose a starting
point $x^0 \in B_\varepsilon(\bar x)$ and conclude by Proposition 3.2 that the subproblem

$-H(x^0) \in DH(x^0)(d)$

has a solution $d^0$. Thus the next iterate $x^1 := x^0 + d^0$ is well defined. Let further $z^0 := \bar x - x^0$
and get $\|z^0\| \le \varepsilon$ by the choice of the starting point $x^0$. By assumption (H1), find a vector
$w^0 \in DH(x^0)(z^0)$ such that

$C \|x^1 - \bar x\| = C \|(x^1 - x^0) - (\bar x - x^0)\| = C \|d^0 - z^0\| \le \|w^0 + H(x^0)\| + o(\|x^0 - \bar x\|).$

Taking this into account and employing assumption (H2), we get (shrinking $\varepsilon$ if necessary) the relationships

$C \|x^1 - \bar x\| \le \|w^0 + H(x^0)\| + o(\|x^0 - \bar x\|)$
$\qquad = \|H(x^0) - H(\bar x) + w^0\| + o(\|x^0 - \bar x\|)$
$\qquad = o(\|x^0 - \bar x\|)$
$\qquad \le \dfrac{C}{2} \|x^0 - \bar x\|,$

which imply that $\|x^1 - \bar x\| \le \frac{1}{2} \|x^0 - \bar x\|$. The latter yields, in particular, that $x^1 \in B_\varepsilon(\bar x)$.
Now standard induction arguments allow us to conclude that the iterative sequence $\{x^k\}$
generated by Algorithm 3.1 is well defined and converges to the solution $\bar x$ of (1.1) with at
least a linear rate. This justifies assertion (i) of the theorem.

Next we prove assertion (ii) showing that the convergence $x^k \to \bar x$ is in fact superlinear
under the validity of assumption (H2). To proceed, we basically follow the proof of assertion
(i) and construct by induction sequences $\{d^k\}$ satisfying

$-H(x^k) \in DH(x^k)(d^k)$ for all $k \in \mathbb{N}$,

$\{z^k\}$ with $z^k := \bar x - x^k$, and $\{w^k\}$ with $w^k \in DH(x^k)(z^k)$ such that

$C \|x^{k+1} - \bar x\| \le \|w^k + H(x^k)\| + o(\|x^k - \bar x\|), \quad k \in \mathbb{N}.$

Applying then assumption (H2) gives us the relationships

$C \|x^{k+1} - \bar x\| \le \|H(x^k) - H(\bar x) + w^k\| + o(\|x^k - \bar x\|) = o(\|x^k - \bar x\|),$

which ensure the superlinear convergence of the iterative sequence $\{x^k\}$ to the solution $\bar x$
of (1.1) and thus complete the proof of the theorem. $\triangle$
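
The superlinear rate in assertion (ii) can be observed numerically on a simple model. The following Python sketch is only an illustration under assumptions made here: it uses the hypothetical function $H(x) = 2x + |x| + x^2$ with the solution $\bar x = 0$, and it relies on the fact that away from the kink at $0$ this function is differentiable, so that by Proposition 2.2(f) the generalized Newton step coincides with the classical one at every nonzero iterate.

```python
def H(x):
    # Hypothetical nonsmooth test function with the root x = 0.
    return 2.0 * x + abs(x) + x * x

def slope(x):
    # Classical derivative of H at a point x != 0.
    return (3.0 if x > 0 else 1.0) + 2.0 * x

def error_ratios(x0, steps=5):
    """Return the ratios |x^{k+1} - 0| / |x^k - 0| along the Newton iterates."""
    x, ratios = x0, []
    for _ in range(steps):
        if x == 0.0:                  # the solution has been hit exactly
            break
        x_next = x - H(x) / slope(x)  # Newton step at a point of differentiability
        ratios.append(abs(x_next) / abs(x))
        x = x_next
    return ratios

ratios = error_ratios(-0.2)
```

For this example the computed ratios decrease toward zero, which is the behavior that superlinear convergence predicts; the sketch illustrates the statement and is not a substitute for the proof.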

3.3 Global Convergence 

Besides the local convergence in the classical Newton method based on suitable assumptions
imposed at the (unknown) solution of the underlying system of equations, there are global
(or semi-local) convergence results of the Kantorovich type [12] for smooth systems of
equations which show that, under certain conditions at the starting point $x^0$ and a number
of assumptions to hold in a suitable region around $x^0$, Newton's iterates are well defined and
converge to a solution belonging to this region; see [12] for more details and references.
In the case of nonsmooth equations results of the Kantorovich type were obtained in
[31, 34] for the corresponding versions of Newton's method. Global convergence results of
different types can be found in, e.g., [6j ?,[9j[28] and their references.

Here is a global convergence result for our generalized Newton method to solve (1.1).

Theorem 3.4 (global convergence of the generalized Newton method). Let $x^0$ be
a starting point of Algorithm 3.1, and let

(3.4) $\Omega := \{ x \in \mathbb{R}^n \mid \|x - x^0\| \le r \}$

with some $r > 0$. Impose the following assumptions:

(a) The mapping $H\colon \mathbb{R}^n \to \mathbb{R}^n$ in (1.1) is metrically regular on $\Omega$ with modulus $\mu > 0$,
i.e., it is metrically regular around every point $x \in \Omega$ with the same modulus $\mu$.

(b) The set-valued map $DH(x)(z)$ uniformly on $\Omega$ converges to $\{0\}$ as $z \to 0$ in the sense
that: for all $\varepsilon > 0$ there is $\delta > 0$ such that

$\|w\| \le \varepsilon$ whenever $w \in DH(x)(z)$, $\|z\| \le \delta$, and $x \in \Omega$.

(c) There is $\alpha \in (0, 1/\mu)$ such that

(3.5) $\mu \|H(x^0)\| \le r(1 - \alpha\mu)$

and for all $x, y \in \Omega$ we have the estimate

(3.6) $\|H(x) - H(y) + v\| \le \alpha \|x - y\|$ whenever $v \in DH(x)(y - x)$.

Then Algorithm 3.1 is well defined, the sequence of iterates $\{x^k\}$ remains in $\Omega$ and converges
to a solution $\bar x \in \Omega$ of (1.1). Moreover, we have the error estimate

(3.7) $\|x^k - \bar x\| \le \dfrac{\alpha\mu}{1 - \alpha\mu} \|x^k - x^{k-1}\|$ for all $k \in \mathbb{N}$.

Proof. The metric regularity assumption (a) allows us to employ Proposition 3.2 and, for
any $x \in \Omega$ and $d \in \mathbb{R}^n$ satisfying the inclusion $-H(x) \in DH(x)(d)$, to find sequences of
$h_k \to -H(x)$ and $t_k \downarrow 0$ as $k \to \infty$ such that

$\|d\| = \lim_{k \to \infty} \Big\| \dfrac{H^{-1}(H(x) + t_k h_k) - x}{t_k} \Big\| \le \lim_{k \to \infty} \mu \|h_k\| = \mu \|H(x)\|.$

In view of assumption (3.5) in (c) and the iteration procedure of the algorithm, this implies

$\|x^1 - x^0\| = \|d^0\| \le \mu \|H(x^0)\| \le r(1 - \alpha\mu),$

which ensures that $x^1 \in \Omega$ due to the form of $\Omega$ in (3.4) and the choice of $\alpha$. Proceeding
further by induction, suppose that $x^1, \ldots, x^k \in \Omega$ and get the relationships

$\|x^{k+1} - x^k\| = \|d^k\| \le \mu \|H(x^k)\|$
$\qquad \le \mu \|H(x^k) - H(x^{k-1}) + H(x^{k-1})\|$
$\qquad \le \alpha\mu \|x^k - x^{k-1}\|$ (using (3.6) and $-H(x^{k-1}) \in DH(x^{k-1})(x^k - x^{k-1})$)
$\qquad \le (\alpha\mu)^k \|x^1 - x^0\| \le r(\alpha\mu)^k (1 - \alpha\mu),$

which imply the estimates

$\|x^{k+1} - x^0\| \le \sum_{j=0}^{k} \|x^{j+1} - x^j\| \le \sum_{j=0}^{k} r(\alpha\mu)^j (1 - \alpha\mu) \le r$

and hence justify that $x^{k+1} \in \Omega$. Thus all the iterates generated by Algorithm 3.1 remain
in $\Omega$. Furthermore, for any natural numbers $k$ and $m$, we have

$\|x^{k+m+1} - x^k\| \le \sum_{j=k}^{k+m} \|x^{j+1} - x^j\| \le \sum_{j=k}^{k+m} r(\alpha\mu)^j (1 - \alpha\mu) \le r(\alpha\mu)^k,$

which shows that the generated sequence $\{x^k\}$ is a Cauchy sequence. Hence it converges to
some point $\bar x$ that obviously belongs to the underlying closed set (3.4).

To show next that $\bar x$ is a solution to the original equation (1.1), we pass to the limit as
$k \to \infty$ in the iterative inclusion

(3.8) $-H(x^k) \in DH(x^k)(x^{k+1} - x^k), \quad k \in \mathbb{N}.$

It follows from assumption (b) that $\lim_{k \to \infty} H(x^k) = 0$. The continuity of $H$ then implies
that $H(\bar x) = 0$, i.e., $\bar x$ is a solution to (1.1).

It remains to justify the error estimate (3.7). To this end, first observe by (3.5) that

$\|x^{k+m+1} - x^k\| \le \sum_{j=k}^{k+m} \|x^{j+1} - x^j\| \le \sum_{j=0}^{m} (\alpha\mu)^{j+1} \|x^k - x^{k-1}\| \le \dfrac{\alpha\mu}{1 - \alpha\mu} \|x^k - x^{k-1}\|$

for all $k, m \in \mathbb{N}$. Passing now to the limit as $m \to \infty$, we arrive at (3.7), which completes
the proof of the theorem. $\triangle$



4 Discussion of Major Assumptions and Comparison with 
Semismooth Newton Methods 

In this section we pursue a twofold goal: to discuss the major assumptions made in Section 3 
and to compare our generalized Newton method based on graphical derivatives with the 
semismooth versions of the generalized Newton method developed in [30, 31]. As we will
see from the discussions below, these two aims are largely interrelated.

Let us begin with sufficient conditions for metric regularity in terms of the constructions
used in the semismooth versions of the generalized Newton method. Given a locally
Lipschitz continuous vector-valued mapping $H\colon \mathbb{R}^n \to \mathbb{R}^m$, we have by the classical
Rademacher theorem that the set of points

(4.1) $S_H := \{ x \in \mathbb{R}^n \mid H \text{ is differentiable at } x \}$

is of full Lebesgue measure in $\mathbb{R}^n$. Thus for any mapping $H\colon \mathbb{R}^n \to \mathbb{R}^m$ locally Lipschitzian
around $\bar x$ the set

(4.2) $\partial_B H(\bar x) := \big\{ \lim_{k \to \infty} H'(x^k) \;\big|\; \exists\, \{x^k\} \subset S_H \text{ with } x^k \to \bar x \big\}$

is nonempty and obviously compact in $\mathbb{R}^{m \times n}$. It was introduced in [38] for $m = 1$ as the set
of "almost-gradients" and then was called in [30] the B-subdifferential of $H$ at $\bar x$. Clarke's
generalized Jacobian [5] of $H$ at $\bar x$ is defined by the convex hull

(4.3) $\partial_C H(\bar x) := \mathrm{co}\{ \partial_B H(\bar x) \}.$
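
A one-dimensional illustration of (4.2) and (4.3) may be helpful. The Python sketch below is a hedged example, not a general procedure: it takes the hypothetical function $H(x) = |x|$ at $\bar x = 0$, samples the derivative along two sequences of differentiability points converging to $0$, and records the resulting limits. For this function the derivative is constant on each side of the kink, so the last sampled value along each sequence equals the limit; in general, approximating the B-subdifferential by sampling is only a heuristic.

```python
def dH(x):
    """Derivative of the hypothetical function H(x) = |x| at a point x != 0."""
    return 1.0 if x > 0 else -1.0

def b_subdifferential_at_zero(n=20):
    """Collect limits of H'(x^k) along x^k = +-2^{-k} -> 0, cf. definition (4.2)."""
    limits = set()
    for sign in (1.0, -1.0):
        values = [dH(sign * 2.0 ** (-k)) for k in range(1, n + 1)]
        limits.add(values[-1])   # the sequence of derivatives is constant on each side
    return sorted(limits)

B = b_subdifferential_at_zero()
# In one dimension the convex hull in (4.3) is the interval between the extreme elements.
clarke_interval = (min(B), max(B))
```

Here `B` contains the two elements $-1$ and $1$, while `clarke_interval` describes the interval $[-1, 1]$, which shows that the B-subdifferential can be a proper subset of Clarke's generalized Jacobian.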

We also make use of the Thibault derivative/limit set [39] (called sometimes the "strict
graphical derivative" [35]) of $H$ at $\bar x$ defined by

(4.4) $D_T H(\bar x)(z) := \mathop{\mathrm{Lim\,sup}}_{x \to \bar x,\; t \downarrow 0} \dfrac{H(x + tz) - H(x)}{t}, \quad z \in \mathbb{R}^n.$

Observe the known relationships [15, 39] between the above derivative sets

(4.5) $\partial_B H(\bar x) z \subset D_T H(\bar x)(z) \subset \partial_C H(\bar x) z, \quad z \in \mathbb{R}^n.$

The next result gives a sufficient condition for metric regularity of Lipschitzian mappings
in terms of the Thibault derivative (4.4). It can be derived from the coderivative
characterization of metric regularity (2.12), while we give here a direct independent proof.

Proposition 4.1 (sufficient condition for metric regularity in terms of Thibault's
derivative). Let $H\colon \mathbb{R}^n \to \mathbb{R}^n$ be locally Lipschitzian around $\bar x$, and let

(4.6) $0 \notin D_T H(\bar x)(z)$ whenever $z \ne 0$.

Then the mapping $H$ is metrically regular around $\bar x$.

Proof. Kummer's inverse function theorem [17, Theorem 1.1] ensures that condition (4.6)
implies (actually is equivalent to) the fact that there are neighborhoods $U$ of $\bar x$ and $V$ of
$H(\bar x)$ such that the mapping $H\colon U \to V$ is one-to-one with a locally Lipschitzian inverse
$H^{-1}\colon V \to U$. Let $\mu > 0$ be a Lipschitz constant of $H^{-1}$ on $V$. Then for all $x \in U$ and
$y \in V$ we have the relationships

$\mathrm{dist}(x; H^{-1}(y)) = \|x - H^{-1}(y)\|$
$\qquad = \|H^{-1}(H(x)) - H^{-1}(y)\|$
$\qquad \le \mu\,\|H(x) - y\|$
$\qquad = \mu\,\mathrm{dist}(y; H(x)),$

which thus justify the metric regularity of $H$ around $\bar x$. $\triangle$
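
The distance estimate in the proof above can be checked numerically on a simple one-dimensional instance. The sketch below is only an illustration under stated assumptions: the map $H(x) = 2x + |x|$, its explicit inverse, the constant `MU`, and the sampling grid are all hypothetical choices that do not come from the paper. For this map $\partial_C H(0) = [1, 3]$ excludes zero, so condition (4.6) holds by the second inclusion in (4.5).

```python
def H(x: float) -> float:
    """Illustrative Lipschitzian map H(x) = 2x + |x| (an assumption, not from the paper)."""
    return 2.0 * x + abs(x)


def H_inv(y: float) -> float:
    """Inverse of H: slope 1/3 for y >= 0 and slope 1 for y < 0."""
    return y / 3.0 if y >= 0.0 else y


MU = 1.0  # Lipschitz constant of H_inv (the largest inverse slope)

# Sample points x in U = [-1, 1] and values y in V = [-1, 1].
grid = [i / 50.0 - 1.0 for i in range(101)]

# Largest violation of |x - H^{-1}(y)| <= MU * |H(x) - y| over the grid;
# a nonpositive value (up to rounding) means the estimate holds on all samples.
worst_gap = max(abs(x - H_inv(y)) - MU * abs(H(x) - y) for x in grid for y in grid)
print("largest violation of the regularity estimate:", worst_gap)
```

A finite grid is of course not a proof; the snippet only makes the chain of (in)equalities in the proof concrete.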

To proceed further with sufficient conditions for the validity of our assumption (H1), we
first introduce the notion of directional boundedness.

Definition 4.2 (directional boundedness). A mapping $H\colon \mathbb{R}^n \to \mathbb{R}^m$ is said to be
DIRECTIONALLY BOUNDED around $\bar x$ if

(4.7) $\limsup_{t \downarrow 0} \left\| \dfrac{H(x + tz) - H(x)}{t} \right\| < \infty$

for all $x$ near $\bar x$ and for all $z \in \mathbb{R}^n$.

It is easy to see that if $H$ is either directionally differentiable around $\bar x$ or locally
Lipschitzian around this point, then it is directionally bounded around $\bar x$. The following
example shows that the converse does not hold in general.

Example 4.3 (directionally bounded mappings may not be directionally differentiable).
Define a real function $H\colon \mathbb{R} \to \mathbb{R}$ by

$H(x) := \begin{cases} x \sin(1/x) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$
It is easy to see that this function is not directionally differentiable at $\bar x = 0$. However, it is
directionally bounded around $\bar x$. Indeed, for any $x \ne 0$ near $\bar x$ condition (4.7) holds because
$H$ is simply differentiable at $x \ne 0$. For $x = 0$ we have

$\limsup_{t \downarrow 0} \left| \dfrac{H(tz) - H(0)}{t} \right| = \limsup_{t \downarrow 0} \left| z \sin\Big(\dfrac{1}{tz}\Big) \right| \le |z| < \infty.$
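
The behavior described in Example 4.3 can be observed numerically. The following sketch evaluates the difference quotients of $H(x) = x\sin(1/x)$ at the origin along two hand-picked sequences of step sizes; the sequences and the direction `z = 1` are illustrative assumptions. The quotients stay within $[-|z|, |z|]$ (directional boundedness) but cluster near both $+1$ and $-1$, so no directional derivative exists at $0$.

```python
import math


def H(x: float) -> float:
    """H(x) = x*sin(1/x) for x != 0 and H(0) = 0, as in Example 4.3."""
    return x * math.sin(1.0 / x) if x != 0.0 else 0.0


def quotient(t: float, z: float) -> float:
    """Difference quotient (H(0 + t*z) - H(0)) / t at the origin."""
    return (H(t * z) - H(0.0)) / t


z = 1.0
# Step sizes t -> 0 chosen so that sin(1/t) is close to +1, respectively -1.
t_plus = [1.0 / (2.0 * math.pi * k + 0.5 * math.pi) for k in range(1, 200)]
t_minus = [1.0 / (2.0 * math.pi * k + 1.5 * math.pi) for k in range(1, 200)]

q_plus = [quotient(t, z) for t in t_plus]
q_minus = [quotient(t, z) for t in t_minus]

# All quotients are bounded by |z| (small slack for rounding errors).
bounded = all(abs(q) <= abs(z) + 1e-12 for q in q_plus + q_minus)
print("bounded:", bounded, "| last quotients:", q_plus[-1], q_minus[-1])
```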



The next proposition and its corollary present verifiable sufficient conditions for the
fulfillment of assumption (H1).

Proposition 4.4 (assumption (H1) from metric regularity). Let $H\colon \mathbb{R}^n \to \mathbb{R}^n$,
and let $\bar x$ be a solution to (1.1). Suppose that $H$ is metrically regular around $\bar x$ (i.e.,
$\ker D^*H(\bar x) = \{0\}$), that it is directionally bounded and one-to-one around this point. Then
assumption (H1) is satisfied.

Proof. Recall that the metric regularity of $H$ around $\bar x$ is equivalent to the condition
$\ker D^*H(\bar x) = \{0\}$ by the coderivative criterion (2.12). Let $U \subset \mathbb{R}^n$ be a neighborhood of
$\bar x$ such that $H$ is metrically regular and one-to-one on $U$. Choose further a neighborhood
$V \subset \mathbb{R}^n$ of $H(\bar x) = 0$ from the definition of metric regularity of $H$ around $\bar x$. Then pick
$x \in U$, $z \in V$ and an arbitrary direction $d \in \mathbb{R}^n$ satisfying $-H(x) \in DH(x)(d)$. Employing
now Proposition 3.2 we get

$d \in \operatorname*{Lim\,sup}_{h \to -H(x),\; t \downarrow 0} \dfrac{H^{-1}(H(x) + th) - x}{t}.$

By the local single-valuedness of $H^{-1}$ and the metric regularity of $H$ around $\bar x$ there exists
a number $\mu > 0$ such that

$\left\| \dfrac{H^{-1}(H(x) + th) - x}{t} - z \right\| \le \dfrac{\mu}{t}\, \big\| H(x) + th - H(x + tz) \big\| = \mu \left\| \dfrac{H(x + tz) - H(x)}{t} - h \right\|$

for all $t > 0$ sufficiently small. It follows that

$\|d - z\| \le \limsup_{h \to -H(x),\; t \downarrow 0} \left\| \dfrac{H^{-1}(H(x) + th) - x}{t} - z \right\| \le \mu \limsup_{t \downarrow 0} \left\| \dfrac{H(x + tz) - H(x)}{t} + H(x) \right\| < \infty$

by the directional boundedness of $H$ around $\bar x$. The boundedness of the family

$\Big\{ v(t) := \dfrac{H(x + tz) - H(x)}{t} \Big\}, \quad t \downarrow 0,$

allows us to select a sequence $t_k \downarrow 0$ such that $v(t_k) \to w$ for some $w \in \mathbb{R}^n$. By passing to
the limit above as $k \to \infty$ and employing Definition 2.1 we get that

$w \in DH(x)(z)$ and $\dfrac{1}{\mu}\|d - z\| \le \|w + H(x)\|,$

which completes the proof of the proposition. $\triangle$
Corollary 4.5 (sufficient conditions for (H1) via Thibault's derivative). Let $\bar x$ be
a solution to (1.1), where $H\colon \mathbb{R}^n \to \mathbb{R}^n$ is locally Lipschitzian around $\bar x$ and such that
condition (4.6) holds, which is automatic when $\det A \ne 0$ for all $A \in \partial_C H(\bar x)$. Then (H1)
is satisfied with $H$ being both metrically regular and one-to-one around $\bar x$.

Proof. Indeed, both metric regularity and bijectivity of $H$ around $\bar x$ assumed in
Proposition 4.4 follow from Proposition 4.1 and its proof. Nonsingularity of all $A \in \partial_C H(\bar x)$ clearly
implies (4.6) by the second inclusion in (4.5). $\triangle$

Note that other conditions ensuring the fulfillment of assumption (H1) for Lipschitzian
and non-Lipschitzian mappings $H\colon \mathbb{R}^n \to \mathbb{R}^n$ can be formulated in terms of Warga's
derivate containers by [40, Theorems 1 and 2] on "fat homeomorphisms" that also imply
the metric regularity and one-to-one properties of $H$.

Next we proceed with the discussion of assumption (H2) and present, in particular,
sufficient conditions for its fulfillment via semismoothness. First observe the following.

Proposition 4.6 (relationship between graphical derivative and generalized Jacobian).
Let $H\colon \mathbb{R}^n \to \mathbb{R}^m$ be locally Lipschitzian around $\bar x$. Then we have

(4.8) $DH(\bar x)(z) \subset \partial_C H(\bar x) z$ for all $z \in \mathbb{R}^n$.

Proof. Pick $w \in DH(\bar x)(z)$ and get by Proposition 2.2(c) and Definition 2.1 a sequence of
$t_k \downarrow 0$ as $k \to \infty$ such that

(4.9) $w = \lim_{k \to \infty} \dfrac{H(\bar x + t_k z) - H(\bar x)}{t_k}.$

It follows from [5, Proposition 2.6.5] that

$\dfrac{H(\bar x + t_k z) - H(\bar x)}{t_k} \in \mathrm{co}\{\partial_C H[\bar x, \bar x + t_k z]\}\, z$ for all $k \in \mathbb{N}$.

Applying to the latter the classical Carathéodory theorem, we find scalars $\gamma_i^k \in [0, t_k]$,
$\lambda_i^k \in [0, 1]$ and matrices $A_i^k \in \partial_C H(\bar x + \gamma_i^k z)$ for $i = 1, \ldots, m + 1$ such that

$\dfrac{H(\bar x + t_k z) - H(\bar x)}{t_k} = \Big[ \sum_{i=1}^{m+1} \lambda_i^k A_i^k \Big] z$ and $\sum_{i=1}^{m+1} \lambda_i^k = 1$ for all $k \in \mathbb{N}$.

Due to the boundedness of the sequences $\{\lambda_i^k\}_{k \in \mathbb{N}}$, the convergence $\bar x + \gamma_i^k z \to \bar x$ as $k \to \infty$
for all $i = 1, \ldots, m + 1$, and the outer/upper semicontinuity of the mapping $x \mapsto \partial_C H(x)$
proved in [5, Proposition 2.6.2] we have that the sequences $\{A_i^k\}$ are bounded as well.
Hence there are subsequences of these sequences (without relabelling), scalars $\lambda_i \in [0, 1]$,
and matrices $A_i$ as $i = 1, \ldots, m + 1$ such that

$\lambda_i^k \to \lambda_i, \quad \sum_{i=1}^{m+1} \lambda_i = 1, \quad A_i^k \to A_i \in \partial_C H(\bar x)$ as $k \to \infty$.

By (4.9) and the subsequent relationships therein, we get

$w = \lim_{k \to \infty} \Big[ \sum_{i=1}^{m+1} \lambda_i^k A_i^k \Big] z = \Big[ \sum_{i=1}^{m+1} \lambda_i A_i \Big] z \in \mathrm{co}\{\partial_C H(\bar x)\}\, z = \partial_C H(\bar x) z$

and thus complete the proof of the proposition. $\triangle$

Inclusion (4.8), which may be strict as illustrated by Example 4.7 below, shows that
our generalized Newton Algorithm 3.1 based on the graphical derivative provides in the case
of Lipschitz equations (1.1) a more accurate choice of the iterative direction $d^k$ via (3.1) in
comparison with the iterative relationship

(4.10) $-H(x^k) \in \partial_C H(x^k) d^k, \quad k = 0, 1, 2, \ldots,$

used in the semismooth Newton method [31] and related developments [15], [16] based on
the generalized Jacobian. If in addition to the assumptions of Proposition 4.6 the mapping
$H$ is directionally differentiable at $\bar x$, then $DH(\bar x)(z) = \{H'(\bar x; z)\}$ by Proposition 2.2(c,d).
Thus in this case we have from Proposition 4.6 that for any $z \in \mathbb{R}^n$ there is $A \in \partial_C H(\bar x)$
such that $H'(\bar x; z) = Az$, which recovers a well-known result from [31, Lemma 2.2].

The following example shows that the converse inclusion in Proposition 4.6 is not satisfied
in general even with the replacement of the set $DH(\bar x)(z)$ in (4.8) by its convex hull
$\mathrm{co}\, DH(\bar x)(z)$ in the case of real functions. Furthermore, the same holds if we replace the
generalized Jacobian in (4.8) by the smaller B-subdifferential $\partial_B H(\bar x)$ from (4.2).

Example 4.7 (graphical derivative is strictly smaller than B-subdifferential and
generalized Jacobian). Consider the simplest nonsmooth convex function $H(x) = |x|$ on
$\mathbb{R}$. In this case $\partial_B H(0) = \{-1, 1\}$ and $\partial_C H(0) = [-1, 1]$. Thus

$\partial_B H(0) z = \{-1, 1\}$ and $\partial_C H(0) z = [-1, 1]$ for $z = 1$.

Since $H(x) = |x|$ is locally Lipschitzian and directionally differentiable, we have

$DH(0)(z) = \{H'(0; z)\} = \{|z|\} = \{1\}$ for $z = 1$.

Hence it gives the relationships

$DH(0)(z) = \mathrm{co}\{DH(0)(z)\} \subset \partial_B H(0) z \subset \partial_C H(0) z,$

where both inclusions are strict. Observe also the difference between the convexification of
the graphical derivative and of the coderivative; in the latter case we have equality (2.8).
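
The three sets compared in Example 4.7 can be approximated with finite differences. This is a rough numerical sketch, not a general-purpose procedure: the step sizes, the two sample points near the origin, and the helper names `one_sided_quotient` and `slope` are assumptions made for illustration only.

```python
def H(x: float) -> float:
    """The function H(x) = |x| from Example 4.7."""
    return abs(x)


def one_sided_quotient(x: float, z: float, t: float = 1e-8) -> float:
    """Approximates the single element H'(x; z) of DH(x)(z)."""
    return (H(x + t * z) - H(x)) / t


def slope(x: float, h: float = 1e-6) -> float:
    """Central difference for H'(x) at a point where H is differentiable."""
    return (H(x + h) - H(x - h)) / (2.0 * h)


z = 1.0
# Graphical derivative at 0 in direction z: the single value |z|.
graphical = one_sided_quotient(0.0, z)

# B-subdifferential at 0: limits of H'(x) as x -> 0 from each side.
b_subdiff = sorted({round(slope(x)) for x in (-1e-3, 1e-3)})
b_times_z = [a * z for a in b_subdiff]

# Clarke's generalized Jacobian is the convex hull, here an interval.
clarke_times_z = (min(b_times_z), max(b_times_z))

print("DH(0)(z)      ~", graphical)
print("d_B H(0) z    ~", b_times_z)
print("d_C H(0) z    ~ interval", clarke_times_z)
```

The Newton equation based on the graphical derivative thus admits fewer candidate directions than (4.10) or (4.11) for this function.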



As mentioned in Section 1, there is an improvement [30] of the iterative procedure (4.10)
that replaces the generalized Jacobian therein by the B-subdifferential:

(4.11) $-H(x^k) \in \partial_B H(x^k) d^k, \quad k = 0, 1, 2, \ldots.$

Note that, along with obvious advantages of version (4.11) over the one in (4.10), in some
settings it is easier to deal with the generalized Jacobian than with its B-subdifferential
counterpart due to much better calculus and convenient representations for $\partial_C H(\bar x)$ in
comparison with the case of $\partial_B H(\bar x)$, which does not even reduce to the classical
subdifferential of convex analysis for simple convex functions as, e.g., $H(x) = |x|$. A remarkable
common feature for both versions in (4.10) and (4.11) is the efficient semismoothness
assumption imposed on the underlying mapping $H$ to ensure its local superlinear convergence.
This assumption, which unifies and labels versions (4.10) and (4.11) as the "semismooth
Newton method", is replaced in our generalized Newton method by assumption (H2). Let
us now recall the notion of semismoothness and compare it with (H2).

A mapping $H\colon \mathbb{R}^n \to \mathbb{R}^m$, locally Lipschitzian and directionally differentiable around
$\bar x$, is semismooth at this point if the limit

(4.12) $\lim_{\substack{h \to z,\; t \downarrow 0 \\ A \in \partial_C H(\bar x + th)}} \{Ah\}$

exists for all $z \in \mathbb{R}^n$; see [8, Definition 7.4.2]. This notion was introduced in [20] for
real-valued functions and then extended in [31] to vector mappings for the purpose of
applications to a nonsmooth Newton's method. It is not hard to check [31, Proposition 2.1]
that the existence of the limit in (4.12) implies the directional differentiability of $H$ at $\bar x$
(but maybe not around this point) with

$H'(\bar x; z) = \lim_{\substack{h \to z,\; t \downarrow 0 \\ A \in \partial_C H(\bar x + th)}} \{Ah\}$ for all $z \in \mathbb{R}^n$.

One of the most useful properties of semismooth mappings is the following representation
for them obtained in [29, Proposition 1]:

(4.13) $\|H(\bar x + z) - H(\bar x) - Az\| = o(\|z\|)$ for all $z \to 0$ and $A \in \partial_C H(\bar x + z)$,

which we exploit now to relate semismoothness to our assumption (H2).
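
To see what estimate (4.13) demands, one can compare a semismooth function with a Lipschitzian function that is known not to be semismooth. The sketch below does this at $\bar x = 0$ for $|x|$ and for $x^2 \sin(1/x)$; both test functions, the sample points $z_k = 1/(2\pi k)$, and the helper `residual_ratio` are illustrative assumptions not taken from the paper. At each sampled $z \ne 0$ both functions are continuously differentiable, so $\partial_C H(z) = \{H'(z)\}$ and the ratio $|H(z) - H(0) - H'(z) z| / |z|$ is exactly the quantity that must tend to zero.

```python
import math


def residual_ratio(H, dH, z: float) -> float:
    """|H(z) - H(0) - H'(z) z| / |z| for z != 0, cf. (4.13) with reference point 0."""
    return abs(H(z) - H(0.0) - dH(z) * z) / abs(z)


def abs_dH(x: float) -> float:
    """Derivative of |x| at x != 0."""
    return math.copysign(1.0, x)


def osc_H(x: float) -> float:
    """Lipschitzian near 0, differentiable everywhere, derivative discontinuous at 0."""
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0


def osc_dH(x: float) -> float:
    """Derivative of osc_H at x != 0."""
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


# Points z_k -> 0 at which cos(1/z_k) = 1 up to rounding.
zs = [1.0 / (2.0 * math.pi * k) for k in range(1, 50)]

ratios_abs = [residual_ratio(abs, abs_dH, z) for z in zs]      # stays at 0
ratios_osc = [residual_ratio(osc_H, osc_dH, z) for z in zs]    # stays near 1

print("|x|          :", max(ratios_abs))
print("x^2 sin(1/x) :", min(ratios_osc), max(ratios_osc))
```

The ratios for $|x|$ vanish, while those for $x^2\sin(1/x)$ stay close to $1$ along this sequence, so (4.13) fails for the latter function at the origin.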

Proposition 4.8 (semismoothness implies assumption (H2)). Let $H\colon \mathbb{R}^n \to \mathbb{R}^m$ be
semismooth at $\bar x$. Then assumption (H2) is satisfied.

Proof. Since any semismooth mapping is Lipschitz continuous on a neighborhood $U$ of $\bar x$,
we have by Proposition 2.2(c) that the graphical derivative reduces to the directional outer limit

$DH(x)(\bar x - x) = \operatorname*{Lim\,sup}_{t \downarrow 0} \dfrac{H(x + t(\bar x - x)) - H(x)}{t}$ for all $x \in U$.

Proposition 4.6 yields therefore that

$DH(x)(\bar x - x) \subset \partial_C H(x)(\bar x - x)$ whenever $x \in U$.

Given any $v \in DH(x)(\bar x - x)$ and using the latter inclusion, find a matrix $A \in \partial_C H(x)$ such
that $v = A(\bar x - x)$. Applying finally property (4.13) of semismooth mappings, we get

$\|H(x) - H(\bar x) + v\| = \|H(x) - H(\bar x) - A(x - \bar x)\| = o(\|x - \bar x\|)$ for all $x \in U$,

which thus verifies (H2) and completes the proof of the proposition. $\triangle$

Note that the previous proposition actually shows that condition (4.13) implies (H2).
The next result states that the converse is also true, i.e., we have that assumption (H2) is
completely equivalent to (4.13) for locally Lipschitzian mappings.
Proposition 4.9 (equivalent description of (H2)). Let $H\colon \mathbb{R}^n \to \mathbb{R}^m$ be locally
Lipschitzian around $\bar x$, and let assumption (H2) hold with some neighborhood $U$ therein. Then

(4.14) $\|H(x) - H(\bar x) - A(x - \bar x)\| = o(\|x - \bar x\|)$ for all $x \in U$ and $A \in \partial_B H(x)$.

Therefore assumption (H2) is equivalent to (4.13).

Proof. Arguing by contradiction, suppose that (4.14) is violated and find sequences $x_k \to \bar x$,
$A_k \in \partial_B H(x_k)$ and a constant $\gamma > 0$ such that

$\|H(x_k) - H(\bar x) - A_k(x_k - \bar x)\| \ge \gamma \|\bar x - x_k\|, \quad k \in \mathbb{N}.$

By the Lipschitz property of $H$ and by construction (4.2) of the B-subdifferential there are
points of differentiability $u_k \in S_H$ close to $x_k$ with $H'(u_k)$ sufficiently close to $A_k$ satisfying

$\|H(u_k) - H(\bar x) - H'(u_k)(u_k - \bar x)\| \ge \dfrac{\gamma}{2} \|\bar x - u_k\|, \quad k \in \mathbb{N}.$

Then Proposition 2.2(c,f) gives us the representations

$DH(u_k)(\bar x - u_k) = \operatorname*{Lim\,sup}_{t \downarrow 0} \dfrac{H(u_k + t(\bar x - u_k)) - H(u_k)}{t} = \{-H'(u_k)(u_k - \bar x)\}$

for all $k \in \mathbb{N}$, which imply therefore that

$\|H(u_k) - H(\bar x) + v\| \ge \dfrac{\gamma}{2} \|\bar x - u_k\|$ whenever $v \in DH(u_k)(\bar x - u_k)$, $k \in \mathbb{N}$.

This clearly contradicts assumption (H2) for $k$ sufficiently large and thus ensures
property (4.14). The equivalence between (H2) and (4.13) follows now from the implication
(H2) $\Longrightarrow$ (4.14) and the proof of Proposition 4.8. $\triangle$

It is well known that, for the class of locally Lipschitzian and directionally differentiable
mappings, condition (4.13) is equivalent to the original definition of semismoothness; see,
e.g., [8, Theorem 7.4.3]. Proposition 4.9 above establishes the equivalence of (4.13) to our
major assumption (H2) provided that $H$ is locally Lipschitzian around the reference point
while it may not be directionally differentiable therein. In fact, it follows from Example 4.11
that assumption (H2) may hold for locally Lipschitzian functions, which are not directionally
differentiable and hence not semismooth. Let us now illustrate that (H2) may also be
satisfied for non-Lipschitzian mappings, in which case it is not equivalent to property (4.13).

Example 4.10 (assumption (H2) holds for non-Lipschitzian one-to-one mappings).
Consider the mapping $H\colon \mathbb{R}^2 \to \mathbb{R}^2$ defined by

(4.15) $H(x_1, x_2) := \Big( x_2 \sqrt{|x_1| + |x_2|^3},\; x_1 \Big)$ for $x_1, x_2 \in \mathbb{R}$.

It is easy to check that this mapping is one-to-one around $(0,0)$. Focusing for definiteness
on the nonnegative branch of the mapping $H$, observe that at any point $(x_1, x_2) \in \mathbb{R}^2$ with
$x_1, x_2 > 0$, the classical Jacobian $JH(x_1, x_2)$ is computed by

$JH(x_1, x_2) = \begin{pmatrix} \dfrac{x_2}{2\sqrt{x_1 + x_2^3}} & \sqrt{x_1 + x_2^3} + \dfrac{3x_2^3}{2\sqrt{x_1 + x_2^3}} \\ 1 & 0 \end{pmatrix}.$

Setting $x_1 = x_2^3$, we see that the first component

$\dfrac{x_2}{2\sqrt{x_1 + x_2^3}} = \dfrac{x_2}{2\sqrt{2 x_2^3}}$

is unbounded when $x_1, x_2 \downarrow 0$. This implies that the Jacobian $JH(x_1, x_2)$ is unbounded
around $(x_1, x_2) = (0,0)$, and hence $H$ is not locally Lipschitzian around the origin.

Let us finally verify that the underlying assumption (H2) is satisfied for the mapping $H$
in (4.15). First assume that $x_1, x_2 > 0$. Then we need to check that, with $(\bar x_1, \bar x_2) = (0,0)$,

$\|H(x_1, x_2) - H(\bar x_1, \bar x_2) + JH(x_1, x_2)(-x_1, -x_2)\|$
$\qquad = \left| x_2\sqrt{x_1 + x_2^3} - \dfrac{x_1 x_2}{2\sqrt{x_1 + x_2^3}} - x_2\sqrt{x_1 + x_2^3} - \dfrac{3x_2^4}{2\sqrt{x_1 + x_2^3}} \right|$
$\qquad = \dfrac{x_1 x_2}{2\sqrt{x_1 + x_2^3}} + \dfrac{3x_2^4}{2\sqrt{x_1 + x_2^3}} = o\Big(\sqrt{x_1^2 + x_2^2}\Big).$

The latter surely holds as $(x_1, x_2) \to (0,0)$ due to the estimates

$\dfrac{x_1 x_2}{2\sqrt{x_1 + x_2^3}\,\sqrt{x_1^2 + x_2^2}} \le \dfrac{x_1}{\sqrt{x_1 + x_2^3}} \le \sqrt{x_1},$

$\dfrac{3x_2^4}{2\sqrt{x_1 + x_2^3}\,\sqrt{x_1^2 + x_2^2}} \le \dfrac{3x_2^3}{2\sqrt{x_1 + x_2^3}} \le 3x_2,$

which thus justify the fulfillment of assumption (H2) in this case. The other cases where
$x_1 > 0$, $x_2 < 0$ or $x_1 < 0$, $x_2 > 0$ or $x_1 < 0$, $x_2 < 0$ or, finally, $x_1 = 0$, $x_2$ arbitrary (here $H$
is not differentiable) can be treated in a similar way.

To complete our discussion on the major assumptions in this section, let us present
an example of a locally Lipschitzian function, which satisfies assumptions (H1) and (H2)
being locally one-to-one and metrically regular around the point in question while not being
directionally differentiable and hence not semismooth at this point.

Example 4.11 (non-semismooth but metrically regular, Lipschitzian, and one-to-one
functions satisfying (H1) and (H2)). We construct a function $H\colon [-1, 1] \to \mathbb{R}$
in the following way. First set $H(\bar x) := 0$ at $\bar x = 0$. Then define $H$ on the interval $(1/2, 1]$
staying between two lines

$\big(1 - \tfrac{1}{2}\big)x + \tfrac{1}{4} \le H(x) \le x$

in the following way: start from $(1, 1)$ and let $H$ be continuous piecewise linear when $x$
goes from $1$ to $1/2$ with the slope $1 + 1/4$ and then with the slope $1/2 - 1/4$ alternately
until $x$ reaches $1/2$. Consider further each interval $(2^{-k}, 2^{-(k-1)}]$ for $k = 2, 3, \ldots$ and,
starting from the point $(2^{-(k-1)}, 2^{-(k-1)})$, define $H$ to be continuous piecewise linear with
the corresponding slopes of either $1 + 2^{-2k}$ or $1 - 2^{-k} - 2^{-2k}$ staying between the two lines

(4.16) $\big(1 - \tfrac{1}{2^k}\big)x + \tfrac{1}{2^{2k}} \le H(x) \le x.$

Thus we have constructed $H$ on the whole interval $[0, 1]$; see Figure 1 for illustration. On
the interval $[-1, 0]$, define the function $H$ symmetrically with respect to the origin. Then
it is easy to see that $H$ is continuous on $[-1, 1]$ and satisfies the following properties:
$H$ is clearly Lipschitz continuous around $\bar x = 0$.

Since $H$ is continuous and monotone with a positive uniform slope, it is one-to-one
and metrically regular around $\bar x$, which directly follows, e.g., from the coderivative
criterion (2.12). This ensures the fulfillment of assumption (H1) by Proposition 4.4.

To verify assumption (H2), fix $k \in \mathbb{N}$ and $x \in (2^{-k}, 2^{-(k-1)}]$ and then pick any

$v \in DH(x)(\bar x - x) \subset \Big[ 1 - \dfrac{1}{2^k} - \dfrac{1}{2^{2k}},\; 1 + \dfrac{1}{2^{2k}} \Big] (\bar x - x).$

Since $\bar x = 0$, the latter implies that

$-\Big(1 + \dfrac{1}{2^{2k}}\Big) x \le v \le -\Big(1 - \dfrac{1}{2^k} - \dfrac{1}{2^{2k}}\Big) x.$

Thus we have by (4.16) and simple computations that

$|H(x) - H(\bar x) + v| \le \dfrac{1}{2^k}|x| + \dfrac{1}{2^{2k}}|x| + \dfrac{1}{2^{2k}} = O\Big(\dfrac{1}{2^{2k}}\Big) = o(|x - \bar x|),$

which shows that assumption (H2) is satisfied. In fact, it follows from above that the
latter value is $O(2^{-2k}) = O(\|x - \bar x\|^2)$.

Let us finally check that $H$ is not directionally differentiable at $x_k = 2^{-k}$ for any
$k \in \mathbb{N}$; therefore it is not directionally differentiable around the reference point $\bar x = 0$
and hence not semismooth at $\bar x$. Indeed, this follows directly from computing the
graphical derivative by

$DH(x_k)(1) = \Big[ 1 - \dfrac{1}{2^k},\; 1 \Big], \quad k \in \mathbb{N},$

which is not single-valued at $x_k$, and thus $H$ is not directionally differentiable at $x_k$
due to Proposition 2.2(c,d).



5 Application to the B-differentiable Newton Method

In this section we present applications of the graphical derivative-based generalized Newton
method developed above to the B-differentiable Newton method for nonsmooth equations
(1.1) originated by Pang [27].

Throughout this section, suppose that $H\colon \mathbb{R}^n \to \mathbb{R}^n$ is locally Lipschitzian and
directionally differentiable around the reference solution $\bar x$ to (1.1). Proposition 2.2(c,d) yields
in this setting that the generalized Newton equation (3.1) in our Algorithm 3.1 reduces to

(5.1) $-H(x^k) = H'(x^k; d^k)$

with respect to the new search direction $d^k$ and that the new iterate $x^{k+1}$ is computed by

(5.2) $x^{k+1} := x^k + d^k, \quad k = 0, 1, 2, \ldots.$
Figure 1: Construction of the mapping from Example 4.11 (illustration).



Note that Pang's B-differentiable Newton method and its further developments (see,
e.g., [28], [30], [31]) are based on Robinson's notion of the B(ouligand)-derivative [32]
for nonsmooth mappings; hence the name. As was then shown in [37], the B-derivative
of a locally Lipschitzian mapping agrees with the classical directional derivative. Thus
the iteration scheme in Pang's B-differentiable method reduces to (5.1) and (5.2) in the
Lipschitzian and directionally differentiable case, and so we keep the original name of [27].
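
Iteration (5.1), (5.2) can be sketched in a few lines for a scalar equation. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the test equation $H(x) = 2x + |x| - 1$ with root $1/3$, the starting point, the tolerance, and the helper names are all hypothetical. In one dimension $H'(x; \cdot)$ is positively homogeneous, so the subproblem $-H(x) = H'(x; d)$ is solved by trying the two directions $d > 0$ and $d < 0$.

```python
def H(x: float) -> float:
    """Hypothetical test equation H(x) = 2x + |x| - 1 with the root x = 1/3."""
    return 2.0 * x + abs(x) - 1.0


def dir_derivative(x: float, d: float) -> float:
    """Directional derivative H'(x; d) of the test function above."""
    if x > 0.0:
        return 3.0 * d
    if x < 0.0:
        return d
    return 2.0 * d + abs(d)  # at the kink x = 0


def newton_direction(x: float) -> float:
    """Solve -H(x) = H'(x; d) for d, cf. (5.1), using positive homogeneity in d."""
    r = -H(x)
    if r == 0.0:
        return 0.0
    for s in (1.0, -1.0):
        unit_slope = dir_derivative(x, s)  # H'(x; s) with s = +1 or -1
        # d = s * tau with tau > 0 must satisfy tau * H'(x; s) = r.
        if unit_slope != 0.0 and r / unit_slope > 0.0:
            return s * r / unit_slope
    raise ValueError("subproblem (5.1) has no solution at x = %r" % x)


def b_newton(x0: float, tol: float = 1e-12, max_iter: int = 50) -> list:
    """Iterate x^{k+1} = x^k + d^k, cf. (5.2), until |H(x^k)| <= tol."""
    iterates = [x0]
    x = x0
    for _ in range(max_iter):
        if abs(H(x)) <= tol:
            break
        x = x + newton_direction(x)
        iterates.append(x)
    return iterates


iterates = b_newton(-2.0)
print(iterates)
```

For piecewise linear equations such as this one the iteration stops after finitely many steps; the `ValueError` branch marks the situation, discussed later in this section, in which subproblem (5.1) is not solvable.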

The next theorem shows what we get from applying our local convergence result from
Theorem 3.3 and the subsequent analysis developed in Sections 3 and 4 to the B-differentiable
Newton method. This theorem employs an equivalent description of assumption (H2) held
in the setting under consideration and the coderivative criterion (2.12) for metric regularity
of the underlying Lipschitzian mapping $H$ ensuring the validity of assumption (H1).

Theorem 5.1 (solvability and local convergence of the B-differentiable Newton
method via metric regularity). Let $H\colon \mathbb{R}^n \to \mathbb{R}^n$ be semismooth, one-to-one, and
metrically regular around a reference solution $\bar x$ to (1.1), i.e.,

(5.3) $0 \in \partial\langle z, H\rangle(\bar x) \Longrightarrow z = 0.$

Then the B-differentiable Newton method (5.1), (5.2) is well defined (meaning that equation
(5.1) is solvable for $d^k$ as $k \in \mathbb{N}$) and converges at least superlinearly to the solution $\bar x$.

Proof. Since $H$ is locally Lipschitzian around $\bar x$, the coderivative criterion (2.12) is
equivalently written in form (5.3) via the limiting subdifferential (2.10) due to the scalarization
formula (2.9). Applying Theorem 3.3 to the B-differentiable Newton method, we need to
check that assumptions (H1) and (H2) are satisfied in the setting under consideration.
Indeed, it follows from Proposition 4.9 and the discussion right after it that (H2) is equivalent
to the semismoothness for locally Lipschitzian and directionally differentiable mappings.
The fulfillment of assumption (H1) is guaranteed by Proposition 4.4. $\triangle$

More specific sufficient conditions for the well-posedness and superlinear convergence of
the B-differentiable Newton method are formulated via the Thibault derivative (4.4).

Corollary 5.2 (B-differentiable Newton method via Thibault's derivative). Let
$H\colon \mathbb{R}^n \to \mathbb{R}^n$ be semismooth at the reference solution point $\bar x$ of equation (1.1), and let
condition (4.6) be satisfied. Then the B-differentiable Newton method (5.1), (5.2) is well
defined and converges superlinearly to the solution $\bar x$.

Proof. Follows from Theorem 5.1 and Corollary 4.5. $\triangle$

Observe by the second inclusion in (4.5) that the assumptions of Corollary 5.2 are
satisfied when all the matrices from the generalized Jacobian $\partial_C H(\bar x)$ are nonsingular. In
the latter case the solvability of subproblem (5.1) and the superlinear convergence of the
B-differentiable Newton method follow from the results of [31] that in turn improve the original
ones in [27], where $H$ is assumed to be strongly Fréchet differentiable at the solution point.

Further, it is shown in [30] that the B-differentiable method for semismooth equations
(1.1) converges superlinearly to the solution $\bar x$ if just matrices $A \in \partial_B H(\bar x)$ are nonsingular
while assuming in addition that subproblem (5.1) is solvable. As illustrated by the example
presented on pp. 243-244 of [30], without the latter assumption the B-differentiable Newton
method may not be well defined for semismooth mappings $H$ on the plane with all the
nonsingular matrices from $\partial_B H(\bar x)$. We want to emphasize that the solvability assumption
for (5.1) is not imposed in Theorem 5.1; it is ensured by metric regularity.

Let us now discuss interconnections between the metric regularity property of locally
Lipschitzian mappings $H\colon \mathbb{R}^n \to \mathbb{R}^n$ via its coderivative characterization (5.3) and the
nonsingularity of the generalized Jacobian and B-subdifferential of $H$ at the reference point.
To this end, observe the following relationships between the corresponding constructions.

Proposition 5.3 (relationships between the B-subdifferential, generalized Jacobian,
and coderivative of Lipschitzian mappings). Let $H\colon \mathbb{R}^n \to \mathbb{R}^m$ be locally
Lipschitzian around $\bar x$. Then we have

(5.4) $\partial_B H(\bar x)^T z \subset \partial\langle z, H\rangle(\bar x) \subset \partial_C H(\bar x)^T z$ for all $z \in \mathbb{R}^m$,

where both inclusions in (5.4) are generally strict.

Proof. Recall that the middle term in (5.4) expressed via the limiting subdifferential
(2.10) is exactly the coderivative $D^*H(\bar x)(z)$ due to the scalarization formula (2.9) for
locally Lipschitzian mappings. Thus the second inclusion in (5.4) follows immediately from
the well-known equality (2.8) involving convexification, and it is strict as a rule due to the
usual nonconvexity of the limiting subdifferential; see [24], [35].

To justify the first inclusion in (5.4), observe that the limiting subdifferential $\partial f(\bar x)$ of
every function $f\colon \mathbb{R}^n \to \mathbb{R}$ continuous around $\bar x$ admits the representation

(5.5) $\partial f(\bar x) = \operatorname*{Lim\,sup}_{x \to \bar x} \widehat\partial f(x)$

via the outer limit (1.6) of the Fréchet/regular subdifferentials

(5.6) $\widehat\partial f(x) := \Big\{ p \in \mathbb{R}^n \;\Big|\; \liminf_{u \to x} \dfrac{f(u) - f(x) - \langle p, u - x\rangle}{\|u - x\|} \ge 0 \Big\}$

of $f$ at $x$; see, e.g., [24, Theorem 1.89]. We obviously have from (5.6) that $\widehat\partial f(x) = \{f'(x)\}$
if $f$ is (Fréchet) differentiable at $x$ with its derivative/gradient $f'(x)$.

Having the mapping $H = (h_1, \ldots, h_m)\colon \mathbb{R}^n \to \mathbb{R}^m$ in the proposition and fixing an
arbitrary vector $\bar z = (\bar z_1, \ldots, \bar z_m) \in \mathbb{R}^m$, form now a scalar function $f_{\bar z}\colon \mathbb{R}^n \to \mathbb{R}$ by

(5.7) $f_{\bar z}(x) := \sum_{i=1}^{m} \bar z_i h_i(x), \quad x \in \mathbb{R}^n.$

Then the first inclusion in (5.4) amounts to saying that

(5.8) $\partial_B H(\bar x)^T \bar z \subset \partial f_{\bar z}(\bar x).$

To proceed with proving (5.8), pick any matrix $A \in \partial_B H(\bar x)$ and denote by $a_i \in \mathbb{R}^n$,
$i = 1, \ldots, m$, its vector rows. By definition (4.2) of the B-subdifferential $\partial_B H(\bar x)$ there is a
sequence $\{x_k\} \subset S_H$ from the set of differentiability (4.1) such that $x_k \to \bar x$ and $H'(x_k) \to A$
as $k \to \infty$. It is clear from (5.7) that the function $f_{\bar z}$ is differentiable at each $x_k$ with

$f_{\bar z}'(x_k) = \sum_{i=1}^{m} \bar z_i h_i'(x_k) \to \sum_{i=1}^{m} \bar z_i a_i = A^T \bar z$ as $k \to \infty$.

Since $\widehat\partial f_{\bar z}(x_k) = \{f_{\bar z}'(x_k)\}$ at all the points of differentiability, we arrive at (5.8) by
representation (5.5) of the limiting subdifferential and thus justify the first inclusion in (5.4).

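To illustrate that the latter inclusion may be strict, consider the function $H(x) := |x|$
on $\mathbb{R}$. Then $\partial_B H(0) z = \{-z, z\}$ for all $z \in \mathbb{R}$, while

$\partial\langle z, H\rangle(0) = D^*H(0)(z) = \begin{cases} [-z, z] & \text{for } z \ge 0, \\ \{-z, z\} & \text{for } z < 0. \end{cases}$

This completes the proof of the proposition. $\triangle$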

It follows from Proposition 5.3 in the case of Lipschitzian transformations $H\colon \mathbb{R}^n \to \mathbb{R}^n$
that the nonsingularity of all the matrices $A \in \partial_C H(\bar x)$ is a sufficient condition for the metric
regularity of $H$ around $\bar x$ due to the coderivative criterion (5.3) while the nonsingularity
of all $A \in \partial_B H(\bar x)$ is a necessary condition for this property. Note however, as it has
been discussed above, that the nonsingularity condition for $\partial_B H(\bar x)$ alone does not ensure
the solvability of subproblem (5.1) in the B-differentiable Newton method, and thus it
cannot be used alone for the justification of algorithm (5.1), (5.2) in the B-differentiable
semismooth case. Furthermore, we are not familiar with any verifiable condition to support
the nonsingularity of $\partial_B H(\bar x)$ in the full justification of the B-differentiable Newton method.

In contrast to this, the metric regularity itself, via its verifiable pointwise characterization
(5.3), ensures the solvability of (5.1) and fully justifies the B-differentiable Newton
method with its superlinear convergence provided that the mapping $H$ is semismooth and
locally invertible around the reference solution point. Note that the nonsingularity of the
generalized Jacobian $\partial_C H(\bar x)$ implies not only the metric regularity but simultaneously
the semismoothness and local invertibility of a Lipschitzian transformation $H\colon \mathbb{R}^n \to \mathbb{R}^n$.
However, the latter condition fails to spot a number of important situations when all the
assumptions of Theorem 5.1 are satisfied; see, in particular, Corollary 5.2 and the corresponding
conditions in terms of Warga's derivate containers discussed right after Corollary 4.5.
We refer the reader to the specific mappings $H\colon \mathbb{R}^2 \to \mathbb{R}^2$ from [17, Example 2.2] and [40,
Example 3.3] that can be used to illustrate the above statement.

6 Concluding Remarks 

In this paper we develop a new generalized Newton method for solving systems of nonsmooth
equations $H(x) = 0$ with $H\colon \mathbb{R}^n \to \mathbb{R}^n$. Local superlinear convergence and global
(of the Kantorovich type) convergence results are derived under relatively mild conditions.
In particular, the local Lipschitz continuity and directional differentiability of $H$ are not
necessarily required. We show that the new method and its specifications have some advantages
in comparison with previously known results on the semismooth and B-differentiable
versions of the generalized Newton method for nonsmooth Lipschitz equations.

Our approach is heavily based on advanced tools of variational analysis and generalized 
differentiation. The algorithm itself is built by using the graphical/contingent derivative 
of H, while other graphical derivatives and coderivatives are employed in formulating ap- 
propriate assumptions and proving solvability and convergence results. The fundamental 
property of metric regularity and its pointwise coderivative characterization play a crucial 
role in the justification of the algorithm and its satisfactory performance. 

In the other lines of developments, it seems appealing to develop an alternative Newton- 
type algorithm, which is constructed by using the basic coderivative instead of the graph- 
ical derivative. This requires certain symmetry assumptions for the given problem, since 
the coderivative is an extension of the adjoint derivative operator. Major advantages of 
a coderivative-based Newton method would be comprehensive calculus rules held for the 
coderivative in contrast to the contingent derivative, complete coderivative characterizations 
of Lipschitzian stability, and explicit calculations of the coderivative in a number of settings 
important for applications. The details of these ideas are part of our future research. 